OCTA
Oneman's Contingency Table Analysis
(c) 1987
"One of many STATOOLS(tm)..."
by
Gerard E. Dallal
53 Beltran Street
Malden, MA 02148
(O)neman's (C)ontingency (T)able (A)nalysis is a program
for the log-linear analysis of multi-dimensional contingency
tables by use of the Deming-Stephan iterative proportional
fitting procedure described in Bishop, Fienberg, and Holland
(1975) and Fienberg (1980).
DISCLAIMER
STATOOLS(tm) are provided "as is" without warranty of
any kind. The entire risk as to the quality, performance,
and fitness for intended purpose is with you. You assume
responsibility for the selection of the program and for the
use of results obtained from that program.
Oneman's Contingency Table Analysis (OCTA) is a program
for the log-linear analysis of multi-dimensional contingency
tables by use of the Deming-Stephan iterative proportional
fitting procedure described in Bishop, Fienberg, and Holland
(1975) and, for a more general audience, Fienberg (1980).
Only the first two letters of a command need be entered.
ENTER -- enter contingency table
TABLE -- get table from OCTA system file
SAVE -- save table in OCTA system file
EDIT -- edit contingency table
MODEL -- specify model
ALL -- all models (three way tables only)
PM -- compute partial and marginal associations
ISOLATE -- isolated cell; structural zero; quasi-independence
DELTA -- specify a constant to be added to all cell counts
INITIAL -- specify initial values
MARGIN -- construct marginal table
STRATUM -- extract a stratum
SP -- select printing options
PRINT -- print
MESSAGE -- message
QUIT -- quit
DATA MANAGEMENT
The ENTER command initiates data entry. Tables may be
saved by use of the SAVE command and recovered, during the
same or subsequent sessions, by use of the TABLE command.
The EDIT command can be used to change any portion of the
table.
FITTING MODELS
The MODEL command is used to fit hierarchical log-linear
models to the table. A model is specified by entering its
minimal set of sufficient configurations. To obtain these
configurations, write down the terms of the log-linear model
and delete any term whose indices are a proper subset of the
indices of any other term in the model. The collection of
indices of those terms that remain forms the minimal set of
sufficient configurations for the model.
A model must be typed on a single line with
configurations separated by one or more blanks or commas.
The MODEL command generates its own prompt, MODEL>. OCTA
keeps returning to this prompt until an empty line (Return)
is entered.
Example: consider the four-dimensional table with
factors W, U, S, and B when the model to be fitted
contains all two factor interactions and the USB three
factor interaction. The model contains the terms
W U S B WU WS WB US UB SB USB
W is eliminated as a subset of WU, US is eliminated as
a subset of USB, and so on, leaving
WU WS WB USB
as the minimal set of sufficient configurations.
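The elimination rule can be sketched in a few lines of Python. This is an illustration only, not part of OCTA; the function name minimal_configurations is hypothetical, and the sketch assumes every factor is named by a single letter, as in the example above.

```python
def minimal_configurations(terms):
    """Keep only the terms whose factors are not a proper subset of
    the factors of some other term in the model."""
    factor_sets = [frozenset(term) for term in terms]
    return [
        term
        for term, factors in zip(terms, factor_sets)
        # "<" on frozensets tests for a proper subset
        if not any(factors < other for other in factor_sets)
    ]


terms = "W U S B WU WS WB US UB SB USB".split()
print(minimal_configurations(terms))  # ['WU', 'WS', 'WB', 'USB']
```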
These configurations are entered at the MODEL> prompt:
OCTA>MODEL
MODEL>WU WS WB USB
.
. [output]
.
MODEL> <Return>
OCTA>
The fitting procedure aborts automatically if a fitted
value of zero is obtained.
The ALL command fits all possible models to a three-
dimensional table.
CONVERGENCE
Iteration stops whenever the largest absolute difference
between an observed and fitted entry in the set of sufficient
configurations is less than 0.01 or the largest ratio of the
absolute difference to the observed entry is less than
0.0001.
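For readers unfamiliar with iterative proportional fitting, the following is a minimal Python sketch of the general technique together with the stopping rule just described. It is not OCTA's FORTRAN code or Algorithm AS 51. The names margin and ipf, the dictionary representation of the table, the starting values of 1, and the max_cycles limit are all assumptions made for illustration.

```python
def margin(table, axes):
    """Sum a table (dict: index tuple -> value) over every axis not in `axes`."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[a] for a in axes)
        out[key] = out.get(key, 0.0) + value
    return out


def ipf(observed, configs, abs_tol=0.01, rel_tol=0.0001, max_cycles=100):
    """Fit the model whose sufficient configurations are `configs`
    (tuples of axis positions) by iterative proportional fitting."""
    fitted = {cell: 1.0 for cell in observed}  # assumed starting values
    targets = {axes: margin(observed, axes) for axes in configs}
    for _ in range(max_cycles):
        # One cycle: rescale the fitted table to match each observed margin.
        for axes in configs:
            current = margin(fitted, axes)
            for cell in fitted:
                key = tuple(cell[a] for a in axes)
                if current[key] > 0:
                    fitted[cell] *= targets[axes][key] / current[key]
        # Stopping rule: largest absolute and relative margin discrepancies.
        max_abs = max_rel = 0.0
        for axes in configs:
            current = margin(fitted, axes)
            for key, obs in targets[axes].items():
                diff = abs(obs - current[key])
                max_abs = max(max_abs, diff)
                if obs > 0:
                    max_rel = max(max_rel, diff / obs)
        if max_abs < abs_tol or max_rel < rel_tol:
            break
    return fitted


# Independence model for a 2 x 2 table: configurations (0,) and (1,).
observed = {(0, 0): 10, (0, 1): 20, (1, 0): 30, (1, 1): 40}
fit = ipf(observed, [(0,), (1,)])
print({cell: round(value, 3) for cell, value in fit.items()})
# {(0, 0): 12.0, (0, 1): 18.0, (1, 0): 28.0, (1, 1): 42.0}
```

For the independence model a single cycle reproduces both sets of margins, so the loop stops after the first pass. Models without closed-form estimates need several cycles.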
PARAMETER ESTIMATES AND THEIR STANDARD ERRORS
Parameters of the log-linear model are estimated by
applying the usual analysis of variance methods to the logs
of the expected cell counts. The reported standard errors
are the asymptotic maximum likelihood estimates of the
standard errors for the parameters of a saturated model, that
is, a model containing all terms of all orders. Lee (1977)
suggests that these are often likely to be overestimates of
the true standard errors. These estimates are undefined if
any of the cell counts is zero.
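As an illustration of "analysis of variance methods applied to the logs", the sketch below computes the effects for a two-way table only: the grand mean of the logs, the row and column main effects, and the interaction terms. The function name u_terms_two_way is hypothetical, the restriction to two dimensions is a simplification, and standard errors are not computed.

```python
import math


def u_terms_two_way(expected):
    """Decompose log expected counts of a two-way table into a grand mean,
    row effects, column effects, and interaction effects."""
    logs = [[math.log(m) for m in row] for row in expected]
    n_rows, n_cols = len(logs), len(logs[0])
    grand = sum(sum(row) for row in logs) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in logs]
    col_means = [sum(logs[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    u_row = [r - grand for r in row_means]
    u_col = [c - grand for c in col_means]
    u_int = [[logs[i][j] - row_means[i] - col_means[j] + grand
              for j in range(n_cols)]
             for i in range(n_rows)]
    return grand, u_row, u_col, u_int


# Expected counts from an independence fit: the interaction terms vanish.
grand, u_row, u_col, u_int = u_terms_two_way([[12.0, 18.0], [28.0, 42.0]])
print(u_row, u_col, u_int)
```

Each set of effects sums to zero over any of its indices, which is the usual constraint in this parameterization.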
PARTIAL AND MARGINAL ASSOCIATION
The PM command gives tests of partial and marginal
association (Brown, 1976). To illustrate for the USB
interaction in the example above, the test of partial
association is the difference between the likelihood ratio
statistic for the model with all three factor interactions
except USB and the likelihood ratio statistic for the model
with all three factor interactions. The test for marginal
association is the difference between the likelihood ratio
statistic for the model with sufficient configurations US,
UB, SB and the likelihood ratio statistic for the model with
sufficient configuration USB, that is, the model with all
effects implied by USB other than USB itself and the model
with all effects implied by USB.
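Both tests are differences of likelihood ratio statistics for a pair of nested models. A minimal sketch of the arithmetic follows, assuming the usual definition G2 = 2 * sum(observed * log(observed / fitted)). The function names and the numbers passed to association_test are hypothetical and are not OCTA output. Taking the degrees of freedom of the test as the difference of the two models' degrees of freedom is the standard rule for nested models, not something stated in this document.

```python
import math


def likelihood_ratio(observed, fitted):
    """G2 = 2 * sum(x * log(x / m)); cells with x = 0 contribute nothing."""
    return 2.0 * sum(x * math.log(x / m)
                     for x, m in zip(observed, fitted) if x > 0)


def association_test(g2_without, df_without, g2_with, df_with):
    """Difference in G2 (and in degrees of freedom) between the model
    without the term of interest and the model that includes it."""
    return g2_without - g2_with, df_without - df_with


# G2 for a 2 x 2 table against its independence fit (cells listed row by row).
print(likelihood_ratio([10, 20, 30, 40], [12.0, 18.0, 28.0, 42.0]))

# Hypothetical statistics for two nested models.
print(association_test(7.5, 3, 2.5, 1))  # (5.0, 2)
```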
If both tests are significant, then the term is probably
needed to build an adequate model. If both tests are
nonsignificant, the term can probably be safely eliminated.
If one test is significant while the other is not, further
investigation is required.
The PM command gives no warning when a fitted value of 0
is obtained.
QUASI-INDEPENDENCE
The ISOLATE command is used to fit quasi-independence
models. Upon entry of the ISOLATE command, OCTA asks for the
indices of the cells to which fitted models will not apply.
Such cells are called isolated or separated.
The calculation of the degrees of freedom of the
reference distribution for the likelihood ratio statistics
for these models is complex (Fienberg, 1980, section 8.3).
The degrees of freedom are a function of both the pattern of
isolated cells and the particular model under consideration.
The degrees of freedom for a quasi-independence model are
equal to (the degrees of freedom for the model for the
complete table) plus (the number of parameters in the
complete model that cannot be estimated due to the pattern of
isolated cells) minus (the number of isolated cells). The
computational difficulties
arise from the determination of the number of unestimable
parameters. OCTA assumes that this number is zero and
computes the degrees of freedom as (the degrees of freedom
for the complete model) minus (the number of isolated cells).
The results are labelled (quasi-independence) to remind the
user that they are computed subject to this rule.
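The rule can be written out as a one-line calculation. The sketch below is illustrative only; the function name quasi_df and its parameter names are hypothetical. The default of zero unestimable parameters mirrors the assumption OCTA makes.

```python
def quasi_df(df_complete, n_isolated, n_unestimable=0):
    """Degrees of freedom for a quasi-independence model.

    df_complete   -- degrees of freedom of the model for the complete table
    n_isolated    -- number of isolated cells
    n_unestimable -- parameters that cannot be estimated because of the
                     pattern of isolated cells (OCTA assumes 0)
    """
    return df_complete + n_unestimable - n_isolated


# Independence in a 3 x 3 table has (3 - 1) * (3 - 1) = 4 degrees of freedom.
print(quasi_df(4, n_isolated=1))                   # 3
print(quasi_df(4, n_isolated=2, n_unestimable=1))  # 3
```

When the pattern of isolated cells does make some parameters unestimable, the value OCTA reports is too small by that number of parameters.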
ANALYZING MARGINAL TABLES AND STRATA
MARGIN forms marginal tables of the original table.
These tables may be subjected to further analysis by use of
the SAVE and TABLE commands. MARGIN may also be used to
permute the original table for display purposes by entering
the highest order interaction with the variables listed in
the order in which they are to appear in the display. The
first variable forms the columns of the display, the second
the rows, the third the fastest varying variable across
tables, and so on.
The STRATUM command allows a stratum, defined by fixing the
levels of up to all but one of the original variables, to be
displayed, saved (SAVE), and analyzed (TABLE).
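The two operations can be pictured with a small Python sketch. It illustrates the ideas only, not OCTA's implementation. The dictionary layout of the table and the function names marginal_table and stratum are assumptions.

```python
from itertools import product


def marginal_table(table, variables, keep):
    """Collapse `table` over every variable not in `keep`; the order of
    `keep` sets the order of the indices in the result."""
    positions = [variables.index(v) for v in keep]
    out = {}
    for cell, count in table.items():
        key = tuple(cell[p] for p in positions)
        out[key] = out.get(key, 0) + count
    return out


def stratum(table, variables, fixed):
    """Extract the cells in which each variable in `fixed` (a dict mapping
    variable -> level) takes the given level; at least one variable must
    remain free."""
    if len(fixed) >= len(variables):
        raise ValueError("at most all but one variable may be fixed")
    free = [p for p, v in enumerate(variables) if v not in fixed]
    out = {}
    for cell, count in table.items():
        if all(cell[variables.index(v)] == level for v, level in fixed.items()):
            out[tuple(cell[p] for p in free)] = count
    return out


variables = ["U", "S", "B"]
cells = product(range(2), repeat=3)
table = {cell: n for n, cell in enumerate(cells, start=1)}  # counts 1..8

print(marginal_table(table, variables, ["U", "S"]))
# {(0, 0): 3, (0, 1): 7, (1, 0): 11, (1, 1): 15}
print(stratum(table, variables, {"B": 1}))
# {(0, 0): 2, (0, 1): 4, (1, 0): 6, (1, 1): 8}
```

Passing all of the variables to marginal_table in a new order, for example ["B", "U", "S"], sums over nothing and simply permutes the table, which corresponds to the display use of MARGIN described above.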
DISPLAYING RESULTS
["Printing" refers to "displaying to screen". For hard
copy, use the Shift-PrtSc combination to print the contents
of the screen or the Ctrl-PrtSc combination to print all
subsequent information displayed on the screen.]
The SP (Set Print options) command lets the user choose
between three types of residuals (simple differences,
standardized residuals, and Freeman-Tukey deviates) and
display (1) a table of observed counts, expected counts, and
residuals, (2) a normal plot of the residuals, and (3) the
effects of the log-linear model along with their standard
errors (see above). The default is to print only the table
(1) with standardized residuals. The PRINT command generates
the display.
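This document does not give formulas for the three residuals. The sketch below uses the definitions commonly found in the log-linear literature (for example Bishop, Fienberg, and Holland, 1975); that OCTA computes them in exactly this form is an assumption. Here x is an observed count, m is the corresponding expected count, and the function names are hypothetical.

```python
import math


def simple_difference(x, m):
    """Observed minus expected."""
    return x - m


def standardized_residual(x, m):
    """(x - m) / sqrt(m)"""
    return (x - m) / math.sqrt(m)


def freeman_tukey_deviate(x, m):
    """sqrt(x) + sqrt(x + 1) - sqrt(4m + 1)"""
    return math.sqrt(x) + math.sqrt(x + 1) - math.sqrt(4 * m + 1)


for x, m in [(10, 12.0), (20, 18.0), (30, 28.0), (40, 42.0)]:
    print(x, m,
          simple_difference(x, m),
          round(standardized_residual(x, m), 3),
          round(freeman_tukey_deviate(x, m), 3))
```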
When a quasi-independence model is fitted, the observed
counts, expected counts, and residuals for isolated cells
appear as -99; the effects of the log-linear model are not
estimated or displayed.
OTHER OPTIONS
DELTA allows a constant (typically 0.5) to be added to
each cell count. Successive calls to DELTA are NOT
cumulative. All changes are based on the original cell
counts.
INITIAL allows a table of initial values to be
specified. Any interactions that are present in the table of
initial values will be present in the table of fitted values.
If a table of initial values contains anything but zeros and
a common nonzero constant (another method of specifying
quasi-independence models), only the expected values should
be used. Any test statistics coaxed out of OCTA do not
measure what they appear to measure.
INTERACTION BETWEEN ISOLATE, DELTA
AND OTHER COMMANDS
Requests for marginal tables or strata undo designations
of isolated cells but will not affect added constants.
Reentering the TABLE command will restore a table to its
original values, as will a call to DELTA with an increment of
0, but the TABLE command will undo designations of isolated
cells while DELTA with an increment of 0 will not.
The PM command removes designations of isolated cells
before it is carried out; the ALL command does not. Neither
command affects added constants.
Expected values and residuals are printed along with the
cell counts when the PRINT command is issued after a model
has been fitted. When several models have been fitted, the
expected values and residuals apply to the most recent model.
Only cell counts are printed when a PM, ALL, or ISOLATE
command intervenes between the MODEL and PRINT commands
or when there is no prior MODEL command.
PERMISSION TO COPY
Individuals and not-for-profit organizations are granted
permission to freely copy this program and documentation
provided
-- no price is charged, and
-- the diskette, containing both program and
documentation, is not modified in any way.
BBS's and software libraries may add OCTA to their
collection upon receipt of written permission from the
author. Under no circumstances may OCTA be duplicated or
circulated as part of ANY OTHER commercial venture.
USER FEE
If you find OCTA to be of use to you, a user's fee of
$10 is requested. OCTA should be treated like a book: Any
number of individuals may use a single copy of OCTA on any
number of machines provided only one user is using it on one
machine at any one time.
ALGORITHMS
OCTA makes use of the following published routines:
Haberman, S.J. (1972). Algorithm AS 51: Log-linear fit for
contingency tables. Applied Statistics, 21, 218-225.
Hill, I.D. (1973). Algorithm AS 66: The normal integral.
Applied Statistics, 22, 424-427.
Lustbader, E.D. and Stodola, R.K. (1981). Algorithm AS 160:
Partial and marginal association in multidimensional
contingency tables. Applied Statistics, 30, 97-105.
Odeh, R.E. and Evans, J.O. (1974). Algorithm AS 70: The
percentage points of the normal distribution. Applied
Statistics, 23, 96-97.
and the author's FORTRAN translation of
Pike, M.C. and Hill, I.D. (1966). Algorithm 291: Logarithm
of the gamma function. Communications of the ACM, 9, 684.
REFERENCES
Bishop, Y.M.M., Fienberg, S.E., and Holland, P.W. (1975),
Discrete Multivariate Analysis: Theory and Practice,
Cambridge, MA: The MIT Press.
Brown, M.B. (1976), "Screening Effects in Multidimensional
Contingency Tables," Applied Statistics, 25, 37-46.
Fienberg, S.E. (1980), The Analysis of Cross-Classified
Categorical Data, 2nd ed., Cambridge, MA: The MIT Press.
Lee, S.K. (1977), "On the Asymptotic Variances of u Terms in
Loglinear Models of Multi-dimensional Contingency Tables,"
Journal of the American Statistical Association, 72, 412-419.
OCTA G.E. Dallal